    An adaptivity hierarchy theorem for property testing

    Adaptivity is known to play a crucial role in property testing. In particular, there exist properties for which there is an exponential gap between the power of adaptive testing algorithms, wherein each query may be determined by the answers received to prior queries, and their non-adaptive counterparts, in which all queries are independent of answers obtained from previous queries. In this work, we investigate the role of adaptivity in property testing at a finer level. We first quantify the degree of adaptivity of a testing algorithm by considering the number of "rounds of adaptivity" it uses. More accurately, we say that a tester is $k$-(round) adaptive if it makes queries in $k+1$ rounds, where the queries in the $i$-th round may depend on the answers obtained in the previous $i-1$ rounds. Then, we ask the following question: does the power of testing algorithms grow smoothly with the number of rounds of adaptivity? We provide a positive answer to the foregoing question by proving an adaptivity hierarchy theorem for property testing. Specifically, our main result shows that for every $n \in \mathbb{N}$ and $0 \le k \le n^{0.99}$ there exists a property $\Pi_{n,k}$ of functions for which (1) there exists a $k$-adaptive tester for $\Pi_{n,k}$ with query complexity $\tilde{O}(k)$, yet (2) any $(k-1)$-adaptive tester for $\Pi_{n,k}$ must make $\Omega(n)$ queries. In addition, we show that such a qualitative adaptivity hierarchy can be witnessed for testing natural properties of graphs.
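    To make the round structure concrete, here is a minimal Python skeleton of a $k$-(round) adaptive tester as defined above. The oracle, the query-selection rule, and the decision rule are hypothetical placeholders of our own, not the paper's construction $\Pi_{n,k}$.

```python
import random

def choose_queries(round_idx, prior_answers, domain_size, m):
    # Placeholder query-selection rule: a real k-adaptive tester would exploit
    # the answers from rounds 1..round_idx-1; here we sample uniformly so the
    # skeleton stays runnable.
    return [random.randrange(domain_size) for _ in range(m)]

def decide(answers):
    # Placeholder accept/reject rule.
    return True

def k_round_adaptive_tester(oracle, domain_size, k, queries_per_round):
    # Queries are issued in k+1 rounds; the queries of round i may depend only
    # on answers collected in earlier rounds, matching the paper's definition
    # of a k-(round) adaptive tester.
    answers = []
    for round_idx in range(k + 1):
        queries = choose_queries(round_idx, answers, domain_size, queries_per_round)
        answers.extend((q, oracle(q)) for q in queries)
    return decide(answers)

# Usage with a toy oracle for a function f: [n] -> {0, 1}.
print(k_round_adaptive_tester(oracle=lambda q: q % 2, domain_size=100, k=3,
                              queries_per_round=5))
```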

    Generalized uniformity testing

    In this work, we revisit the problem of uniformity testing of discrete probability distributions. A fundamental problem in distribution testing, testing uniformity over a known domain has been addressed in a significant line of work, and is by now fully understood. The complexity of deciding whether an unknown distribution is uniform over its unknown (and arbitrary) support, however, is much less clear. Yet, this task arises as soon as no prior knowledge of the domain is available, or whenever the samples originate from an unknown and unstructured universe. We introduce and study this generalized uniformity testing question, and establish nearly tight upper and lower bounds showing that, quite surprisingly, its sample complexity significantly differs from the known-domain case. Moreover, our algorithm is intrinsically adaptive, in contrast to the overwhelming majority of known distribution testing algorithms.
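    For context, the known-domain baseline the abstract contrasts against can be tested with the classic collision statistic: under the uniform distribution on a domain of size $n$ the collision probability is exactly $1/n$, while a distribution $\epsilon$-far from uniform in total variation has collision probability at least $(1+4\epsilon^2)/n$. A minimal sketch of that baseline (ours, not the paper's adaptive algorithm; sample sizes and constants are illustrative, not tuned):

```python
import random
from itertools import combinations

def collision_uniformity_test(samples, domain_size, eps):
    # Count pairwise collisions among the samples; the empirical collision
    # rate estimates the collision probability ||p||_2^2. Uniform gives 1/n,
    # eps-far-from-uniform gives at least (1 + 4*eps^2)/n, so threshold
    # halfway in between.
    m = len(samples)
    collisions = sum(1 for x, y in combinations(samples, 2) if x == y)
    collision_rate = collisions / (m * (m - 1) / 2)
    threshold = (1 + 2 * eps**2) / domain_size
    return collision_rate <= threshold  # True = "looks uniform"

# Usage: uniform samples over a domain of size 100 should (usually) pass.
samples = [random.randrange(100) for _ in range(500)]
print(collision_uniformity_test(samples, domain_size=100, eps=0.25))
```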

    Private Distribution Testing with Heterogeneous Constraints: Your Epsilon Might Not Be Mine

    Private closeness testing asks one to decide whether the underlying probability distributions of two sensitive datasets are identical or differ significantly in statistical distance, while guaranteeing (differential) privacy of the data. As in most (if not all) distribution testing questions studied under privacy constraints, however, previous work assumes that the two datasets are equally sensitive, i.e., must be provided the same privacy guarantees. This is often an unrealistic assumption, as different sources of data come with different privacy requirements; as a result, known closeness testing algorithms might be unnecessarily conservative, "paying" too high a privacy budget for half of the data. In this work, we initiate the study of the closeness testing problem under heterogeneous privacy constraints, where the two datasets come with distinct privacy requirements. We formalize the question and provide algorithms under the three most widely used differential privacy settings, with a particular focus on the local and shuffle models of privacy, and show that one can indeed achieve better sample efficiency when taking into account the two different "epsilon" requirements.
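    As a toy illustration of heterogeneous privacy budgets (not the paper's tester), each dataset's histogram can be privatized with the standard Laplace mechanism at that dataset's own epsilon, so the noise scales differ across the two sides. All names below are ours:

```python
import numpy as np

def private_histogram(samples, domain_size, epsilon, rng):
    # Standard Laplace mechanism: a histogram has sensitivity 1 under
    # add/remove neighbors, so adding Lap(1/epsilon) noise to each bucket
    # is epsilon-differentially private.
    counts = np.bincount(samples, minlength=domain_size).astype(float)
    return counts + rng.laplace(scale=1.0 / epsilon, size=domain_size)

rng = np.random.default_rng(0)
# Two sensitive datasets with heterogeneous privacy requirements:
hist_a = private_histogram(np.array([0, 1, 1, 2]), 3, epsilon=0.5, rng=rng)  # stricter
hist_b = private_histogram(np.array([0, 0, 2, 2]), 3, epsilon=2.0, rng=rng)  # looser
# A closeness tester would now compare hist_a and hist_b while accounting
# for the different noise levels on each side.
print(hist_a, hist_b)
```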

    Lemmas of Differential Privacy

    We aim to collect buried lemmas that are useful for proofs. In particular, we try to provide self-contained proofs for those lemmas and categorise them according to their usage. Comment: comments, feedback, and suggested additions welcome.
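    As an example of the kind of self-contained lemma such a collection would include (our illustration, not necessarily an entry of the paper), basic composition:

```latex
% Basic composition: if M_1 is eps_1-DP and M_2 is eps_2-DP (run with
% independent randomness), then x -> (M_1(x), M_2(x)) is (eps_1 + eps_2)-DP.
% For any neighboring datasets x, x' and any output pair (a, b):
\[
\frac{\Pr[(M_1(x), M_2(x)) = (a,b)]}{\Pr[(M_1(x'), M_2(x')) = (a,b)]}
  = \frac{\Pr[M_1(x)=a]}{\Pr[M_1(x')=a]}
    \cdot \frac{\Pr[M_2(x)=b]}{\Pr[M_2(x')=b]}
  \le e^{\varepsilon_1} e^{\varepsilon_2}
  = e^{\varepsilon_1+\varepsilon_2}.
\]
```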

    Robust Testing in High-Dimensional Sparse Models

    We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. In the first model, we are given $n$ i.i.d. samples from the distribution $\mathcal{N}(\theta, I_d)$ (with unknown $\theta$), of which a small fraction has been arbitrarily corrupted. Under the promise that $\|\theta\|_0 \le s$, we want to correctly distinguish whether $\|\theta\|_2 = 0$ or $\|\theta\|_2 > \gamma$, for some input parameter $\gamma > 0$. We show that any algorithm for this task requires $n = \Omega(s \log \frac{ed}{s})$ samples, which is tight up to logarithmic factors. We also extend our results to other common notions of sparsity, namely, $\|\theta\|_q \le s$ for any $0 < q < 2$. In the second observation model that we consider, the data is generated according to a sparse linear regression model, where the covariates are i.i.d. Gaussian and the regression coefficient (signal) is known to be $s$-sparse. Here too we assume that an $\epsilon$-fraction of the data is arbitrarily corrupted. We show that any algorithm that reliably tests the norm of the regression coefficient requires at least $n = \Omega(\min(s \log d, 1/\gamma^4))$ samples. Our results show that the complexity of testing in these two settings significantly increases under robustness constraints. This is in line with recent observations made in robust mean testing and robust covariance testing. Comment: Fixed typos, added a figure and discussion section.
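    To make the first testing task concrete, here is a naive baseline of our own (the paper proves lower bounds and does not propose this tester): estimate $\theta$ robustly by coordinate-wise medians, keep the $s$ largest coordinates, and threshold the resulting norm. The threshold $\gamma/2$ is an arbitrary illustrative choice.

```python
import numpy as np

def robust_sparse_norm_test(X, s, gamma):
    # Coordinate-wise median is robust to a small corrupted fraction of rows;
    # the s-sparsity promise lets us keep only the s largest coordinates.
    theta_hat = np.median(X, axis=0)
    top_s = np.sort(np.abs(theta_hat))[-s:]
    return "theta != 0" if np.linalg.norm(top_s) > gamma / 2 else "theta = 0"

# Usage on synthetic data: n samples, d dimensions, a planted s-sparse signal,
# with a small fraction of rows arbitrarily corrupted.
rng = np.random.default_rng(1)
n, d, s = 2000, 50, 5
theta = np.zeros(d)
theta[:s] = 1.0
X = theta + rng.standard_normal((n, d))
X[:20] += 100.0  # corrupt 1% of the rows
print(robust_sparse_norm_test(X, s, gamma=1.0))
```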